Improved Deep Learning Baselines for Ubuntu Corpus Dialogs
Authors
Abstract
This paper presents the results of our experiments on the Ubuntu Dialog Corpus, the largest publicly available multi-turn dialog corpus. First, we use an in-house implementation of previously reported models to perform an independent evaluation on the same data. Second, we evaluate the performance of various LSTMs, Bi-LSTMs and CNNs on the dataset. Third, we create an ensemble by averaging the predictions of multiple models. The ensemble further improves performance and achieves a state-of-the-art result on this dataset. Finally, we discuss our future plans for this corpus.
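The ensembling step mentioned in the abstract, averaging the predictions of several models, can be illustrated with a minimal sketch. It assumes each model outputs one score per candidate response; the function names (`average_ensemble`, `rank_candidates`) and the scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ensemble(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-candidate scores across models.

    model_scores[m][c] is the score that model m assigns to candidate c.
    """
    if not model_scores:
        raise ValueError("need at least one model")
    n_candidates = len(model_scores[0])
    if any(len(scores) != n_candidates for scores in model_scores):
        raise ValueError("all models must score the same candidates")
    n_models = len(model_scores)
    return [
        sum(scores[c] for scores in model_scores) / n_models
        for c in range(n_candidates)
    ]


def rank_candidates(scores: Sequence[float]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)


# Hypothetical scores from three models for four candidate responses.
scores_a = [0.9, 0.2, 0.4, 0.1]
scores_b = [0.3, 0.6, 0.5, 0.2]
scores_c = [0.6, 0.1, 0.6, 0.3]

ensemble = average_ensemble([scores_a, scores_b, scores_c])
ranking = rank_candidates(ensemble)
```

In this toy case the second model alone would prefer candidate 1, but the averaged scores rank candidate 0 first. This is the general motivation for averaging: the errors of individual models can partly cancel out.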
Similar resources
Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus
The Ubuntu dialogue corpus is the largest publicly available dialogue corpus, which makes it feasible to build end-to-end deep neural network models directly from conversation data. One challenge of the Ubuntu dialogue corpus is its large number of out-of-vocabulary words. In this paper we propose a method that combines general pre-trained word embedding vectors with those generated on the taskspeci...
Content-Learning Correlations in Spoken Tutoring Dialogs at Word, Turn, and Discourse Levels
We study correlations between dialog content and learning in a corpus of human-computer tutoring dialogs. Using an online encyclopedia, we first extract domain-specific concepts discussed in our dialogs. We then extend previously studied shallow dialog metrics by incorporating content at three levels of granularity (word, turn and discourse) and also by distinguishing between students’ spoken an...
Deep Neural Network Approach for the Dialog State Tracking Challenge
While belief tracking is known to be important in allowing statistical dialog systems to manage dialogs in a highly robust manner, until recently little attention has been given to analysing the behaviour of belief tracking techniques. The Dialogue State Tracking Challenge has allowed for such an analysis, comparing multiple belief tracking approaches on a shared task. Recent success in using d...
Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus
In this paper, we analyze neural network-based dialogue systems trained in an end-to-end manner using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This dataset is interesting because of its size, long context lengths, and technical nature; thus, it can be use...
Building a Corpus of Phrases Related to Learning for Sentiment Analysis
Learning-centered emotions, unlike basic emotions, emerge during deep learning activities, and they have an important relation to the cognitive processes of students. In this paper we present the creation process of a corpus of phrases (opinions) related to learning computer programming. Opinions (textual phrases) are categorized into different emotions related to learning such as frustrated, bored, ne...
Journal title:
- CoRR
Volume abs/1510.03753 Issue
Pages -
Publication date 2015